Abstract.

The human alphacoronaviruses HCoV-NL63 and HCoV-229E are commonly associated with upper respiratory tract
infections (URTI). Information on their molecular epidemiology and evolutionary dynamics in the tropical region of
southeast Asia, however, is limited. Here, we analyzed the phylogeny, temporal distribution, population history,
and clinical manifestations among patients infected with HCoV-NL63 and HCoV-229E. Nasopharyngeal swabs
were collected from 2,060 consenting adults presenting with acute URTI symptoms in Kuala Lumpur, Malaysia,
between 2012 and 2013. The presence of HCoV-NL63 and HCoV-229E was detected using multiplex polymerase
chain reaction (PCR). The spike glycoprotein, nucleocapsid, and 1a genes were sequenced for phylogenetic
reconstruction and Bayesian coalescent inference. A total of 68/2,060 (3.3%) subjects were positive for human
alphacoronavirus; HCoV-NL63 and HCoV-229E were detected in 45 (2.2%) and 23 (1.1%) patients, respectively. A
peak in the number of HCoV-NL63 infections was recorded between June and October 2012. Phylogenetic
inference revealed that 62.8% of HCoV-NL63 infections belonged to genotype B and 37.2% to genotype C, while all
HCoV-229E sequences clustered within group 4. Molecular dating analysis indicated that the origin of HCoV-NL63
was dated to 1921, before it diverged into genotype A (1975), genotype B (1996), and genotype C (2003). The
root of the HCoV-229E tree was dated to 1955, before it diverged into groups 1-4 between the 1970s and 1990s.

The study described the seasonality, molecular diversity, and evolutionary dynamics of human alphacoronavirus 
infections in a tropical region. 


INTRODUCTION 

Human coronaviruses were first reported in the mid-1960s and are known to be associated
with acute upper respiratory tract infections (URTI) or the common cold. According to the
International Committee on Taxonomy of Viruses, human coronavirus NL63 (HCoV-NL63)
and 229E (HCoV-229E) belong to the alphacoronavirus genus, a member of the Coronaviridae
family. Coronaviruses are positive-strand RNA viruses with the largest genome of approximately
27-31 kb in size. 4 In previous studies, analysis of the spike (S) glycoprotein, nucleocapsid (N),
and 1a genes of HCoV-NL63 and HCoV-229E revealed evidence of genetic recombination,
genetic drift, and positive selection events as part of the evolution of the virus. 5,6
Phylogenetically, HCoV-NL63 and HCoV-229E are more closely related to each other than to
any other human coronavirus. 7

HCoV-NL63 and HCoV-229E account for about 5% of all acute URTI, 7-9 and a small
proportion of infections are associated with hospital admission. 10,11 URTI symptoms such
as cough and sore throat are often observed in patients infected with either HCoV-NL63 or
HCoV-229E. The prevalence of HCoV-NL63 varies from one study to another; however, in
most temperate and tropical countries, it appears to peak around September-April, whereas
HCoV-229E is usually detected at low rates throughout the year. 14-16 In spite of the clinical
importance of HCoV infections, 17 the prevalence, seasonality, and clinical and phylogenetic
characteristics of HCoVs remain mostly unreported from the tropical region of southeast Asia.

On the basis of the S, N, and 1a genes of the HCoV-NL63 and HCoV-229E sequences from
Malaysia and also worldwide, we describe the genetic history and phylodynamic profiles of both
human alphacoronaviruses using a set of phylogenetic tools.

MATERIALS AND METHODS 


Ethics statement. 

The study was approved by the University of Malaya Medical Ethics Committee 
(MEC890.1). Standard, multilingual consent forms permitted by the Medical Ethics Committee 
were used. Written consent was obtained from all study participants. 

Clinical specimens. 

A total of 2,060 consenting outpatients who presented with acute URTI symptoms were
recruited at the Primary Care Clinic of University Malaya Medical Center in Kuala Lumpur,
Malaysia, between March 2012 and February 2013. Demographic data such as age, gender, and
ethnicity were acquired before the collection of nasopharyngeal swabs. The severity of the URTI
symptoms (sneezing, nasal discharge, nasal congestion, headache, sore throat, voice hoarseness,
muscle ache, and cough) was graded according to criteria described earlier. The
nasopharyngeal swabs were transferred to the laboratory in universal transport media and stored
at -80°C.

Molecular detection of HCoV-NL63 and HCoV-229E. 

Extraction of total nucleic acids from the nasopharyngeal swabs was carried out using the
magnetic bead-based protocols applied in the NucliSENS easyMAG automated nucleic acid
extraction system (BioMerieux). The presence of respiratory viruses in specimens was
examined using the xTAG Respiratory Virus Panel FAST multiplex reverse transcriptase
polymerase chain reaction (RT-PCR) assay (Luminex Molecular Diagnostics), which can
identify HCoV-NL63, HCoV-229E, HCoV-OC43, HCoV-HKU1, and other respiratory viruses
and subtypes. 24

Genetic analysis of HCoV-NL63 and HCoV-229E. 

Gene fragment sequencing of the S (S1 domain), complete N, and partial 1a (nsp3) genes was
performed for HCoV-NL63 and HCoV-229E specimens. The S1 is a highly variable receptor-binding
domain, whereas the N and nsp3 are conserved regions within the coronavirus genome,
and these three regions are therefore efficiently used for genotyping. 5,6 Viral RNA was reverse
transcribed into complementary DNA (cDNA) using the SuperScript III kit (Invitrogen) with
random hexamers (Applied Biosystems). The partial S gene (S1 domain) (HCoV-NL63: 1,383 nt
[20,413-21,796] and HCoV-229E: 855 nt [20,819-21,674]), complete N gene (HCoV-NL63:
1,133 nt [26,133-27,266] and HCoV-229E: 1,330 nt [25,673-27,003]), and partial 1a (nsp3)
gene (HCoV-NL63: 781 nt [5,811-6,592] and HCoV-229E: 766 nt [5,898-6,664]) were
amplified through PCR using 10 μM of newly designed or previously published primers listed in
Table 1. The PCR mixture (25 μL) contained cDNA, PCR buffer (10 mM Tris-HCl [pH 8.3], 50
mM KCl, 3 mM MgCl2, and 0.01% gelatin), 100 μM (each) deoxynucleoside triphosphates, Hi-Spec
additive, and 4 U/μL BIO-X-ACT Short DNA polymerase (BioLine). The cycling
conditions were as follows: initial denaturation at 95°C for 5 minutes followed by 40 cycles of
94°C for 1 minute, 54.5°C for 1 minute, 72°C for 1 minute, and a final extension at 72°C for 10
minutes. PCR reactions were performed in a C1000 Touch automated thermal cycler (Bio-Rad).
Nested/semi-nested PCR was performed if necessary, under the same cycling conditions for 30
cycles. Purified PCR products were sequenced using the ABI PRISM 3730XL DNA Analyzer
(Applied Biosystems). The nucleotide sequences were codon aligned with relevant complete and
partial HCoV-NL63 and HCoV-229E reference sequences retrieved from GenBank. 5,6,28-31

Maximum clade credibility (MCC) trees for the partial S (S1 domain), complete N, and
partial 1a (nsp3) genes were reconstructed in BEAST (version 1.7). 32 MCC trees were produced
using a relaxed molecular clock, assuming an uncorrelated lognormal distribution under the general
time-reversible nucleotide substitution model with a proportion of invariant sites (GTR+I) and a
constant or exponential coalescent tree model. The Markov chain Monte Carlo run was set at 6 ×
10⁶ steps, sampled every 10,000 states. The trees were annotated using the TreeAnnotator
program included in the BEAST package, after a 10% burn-in, and visualized in FigTree
(version 1.3.1). The evolutionary history and divergence times (in calendar years) for the HCoV-NL63
and HCoV-229E genotypes were also assessed. The mean divergence time and the 95%
highest posterior density regions were evaluated. The best-fitting model was determined by the
Bayes factor using marginal likelihood analysis implemented in Tracer (version 1.5). 32 The
substitution rate of 3.3 × 10⁻⁴ substitutions/site/year for the S gene of human alphacoronavirus
estimated previously was used for the divergence time inference. 5
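As a rough check on the sampling scheme described above, the number of posterior samples that survive the burn-in follows directly from the chain length, the sampling interval, and the burn-in percentage. The sketch below is plain arithmetic and not a BEAST or TreeAnnotator invocation; the function name is hypothetical, and it ignores the initial state that a logger may also record.

```python
def retained_samples(chain_length, sample_every, burn_in_percent):
    """Return (samples logged, samples kept after discarding the burn-in)."""
    logged = chain_length // sample_every
    # Integer arithmetic avoids floating-point rounding on the burn-in count.
    discarded = logged * burn_in_percent // 100
    return logged, logged - discarded


# Values taken from the text: 6 x 10^6 steps, sampled every 10,000 states, 10% burn-in.
logged, kept = retained_samples(6_000_000, 10_000, 10)
print(f"{logged} samples logged, {kept} kept after burn-in")
```

Under these settings roughly 600 states are logged and 540 contribute to the annotated tree.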

Maximum likelihood (ML) phylogenetic trees were also reconstructed for the three regions in
the Phylogenetic Analysis Using Parsimony (PAUP 4.0) software, 34 with a Hasegawa-Kishino-Yano
nucleotide substitution model plus discrete gamma categories. The statistical robustness
and reliability of the branching orders were evaluated by a bootstrap analysis of 1,000 replicates.
To investigate the genetic relatedness among the HCoV-NL63 and HCoV-229E genotypes, inter-genotype
pairwise nucleotide distances were estimated for the S gene using MEGA 5.1. Such
analysis was not implemented for the N and 1a genes because of their high genetic invariability
across HCoV-NL63 and HCoV-229E genotypes. 5,6
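To make the inter-genotype distance concrete, the sketch below computes an uncorrected pairwise distance (the proportion of differing sites) and averages it over all between-group sequence pairs. It assumes the simple p-distance, which matches the "percentage of nucleotide difference" wording used for Table 3 but may not be the exact model selected in MEGA; the function names and the toy sequences are hypothetical and are not data from the study.

```python
from itertools import product


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a in "ACGT" and base_b in "ACGT":
            compared += 1
            if base_a != base_b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def mean_between_group_distance(group_x, group_y):
    """Average p-distance over every pair drawn from two groups of sequences."""
    distances = [p_distance(x, y) for x, y in product(group_x, group_y)]
    return sum(distances) / len(distances)


# Toy aligned fragments standing in for two genotypes (illustrative only).
genotype_x = ["ACGTACGTAC", "ACGTACGTAT"]
genotype_y = ["ACGAACGTAC", "ACGAACGTAT"]
mean_distance = mean_between_group_distance(genotype_x, genotype_y)
print(f"mean between-group distance: {100 * mean_distance:.1f}%")
```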

Statistical analysis. 

All categorical variables were analyzed using the two-tailed Fisher’s exact test/χ² test in the
Statistical Package for the Social Sciences (release 16.0; IBM Corp., Chicago, IL). P values <
0.05 were considered significant.
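For readers who want to see what the two-tailed Fisher's exact test computes, the following is a minimal standard-library sketch, not the SPSS procedure used in the study. It sums the hypergeometric probabilities of every 2 × 2 table that is no more likely than the observed one. The function name is hypothetical; the example counts are the gender rows of Table 2 (25 male and 20 female for HCoV-NL63, 12 male and 11 female for HCoV-229E).

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum over all tables at least as extreme (no more probable) than observed;
    # the small tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


gender_p = fisher_exact_two_tailed(25, 20, 12, 11)
print(f"gender, HCoV-NL63 vs. HCoV-229E: P = {gender_p:.2f}")
```

The result should be well above the 0.05 threshold, in keeping with the nonsignificant gender comparison reported in Table 2.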


Nucleotide sequences. 

HCoV-NL63 and HCoV-229E nucleotide sequences produced in the study have been 
deposited in GenBank under the accession nos. KT359730-KT359913. 

RESULTS AND DISCUSSION 

Detection of HCoV-NL63 and HCoV-229E in nasopharyngeal swabs. 

In the current cross-sectional study, a total of 2,060 nasopharyngeal swab specimens
collected in Kuala Lumpur, Malaysia, throughout a 12-month study period (March 2012 to
February 2013) were screened for the presence of HCoV-NL63 and HCoV-229E using the
multiplex RT-PCR method, as an alternative approach to other detection methods such as cell
culture. 36 Human alphacoronavirus was identified in 68 (3.3%) subjects; HCoV-NL63 and
HCoV-229E were detected in 45/2,060 (2.2%) and 23/2,060 (1.1%) patients, respectively. These
findings are consistent with the global average prevalence of human alphacoronavirus, which
ranges between 1% and 10%, with HCoV-229E generally detected at lower rates than HCoV-NL63. 8-10,27,37-40
In contrast to an earlier study, 41 no coinfection of alpha- and betacoronavirus
(HCoV-OC43 and HCoV-HKU1) was observed within an individual. The age, gender, and ethnicity
of the patients are summarized in Table 2. A peak in the number of HCoV-NL63 infections
was recorded for the period between June and October 2012, although the number of patients
with URTI symptoms screened during those months was relatively low (Figure 1). This pattern
of virus prevalence is consistent with that observed in neighboring Thailand, in which a
peak of HCoV-NL63 incidence was recorded in September. 14 In contrast, studies from temperate
regions commonly reported a higher prevalence of HCoV-NL63 during winter seasons. 7-9,42
The number of HCoV-229E infections detected in Malaysia, however, was low, with no
significant peak observed throughout the year, similar to other studies reported worldwide. 14,38,43
It is important to note that the study was performed over a relatively short period, which
limits the comparison of epidemiological and disease trends with reports from other countries.
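The detection rates quoted above can be reproduced with a few lines of arithmetic. The sketch below only restates the counts given in the text (45 HCoV-NL63 and 23 HCoV-229E positives among 2,060 swabs); the function and variable names are hypothetical.

```python
def prevalence_percent(positives, screened):
    """Detection rate as a percentage of specimens screened."""
    return 100.0 * positives / screened


screened = 2060
positives = {"HCoV-NL63": 45, "HCoV-229E": 23}

for virus, count in positives.items():
    print(f"{virus}: {count}/{screened} = {prevalence_percent(count, screened):.1f}%")

total_positive = sum(positives.values())
overall = prevalence_percent(total_positive, screened)
print(f"Human alphacoronavirus: {total_positive}/{screened} = {overall:.1f}%")
```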

Phylogenetic analysis of the S, N, and 1a genes.

A total of 42/45 (93.3%) partial S (S1 domain) and 43/45 (95.6%) of each of the complete N and
partial 1a (nsp3) genes were successfully sequenced from HCoV-NL63 specimens.
Amplification of these genes was difficult for two xTAG-positive HCoV-NL63 specimens,
possibly due to their low viral copy number. Phylogenetic analysis of HCoV-NL63 (Figure 2 and
Supplemental Figure 1) showed that 27 subjects (27/43, 62.8%) in the study belonged to
genotype B (supported by a posterior probability of 1.0 and bootstrap value of 100% at the
internal nodes of the MCC and ML trees of the S gene, respectively, with an intra-group pairwise
genetic distance of 0.6% ± 0.1%) together with previously reported sequences from the United
States, Europe, and Asia. 5,25,28,29 Another 16 subjects (16/43, 37.2%) were found to be grouped
under genotype C (supported by a posterior probability of 1.0 and bootstrap value of 67% at the
internal nodes of the MCC and ML trees of the S gene, respectively, with an intra-group pairwise
genetic distance of 0.2% ± 0.1%) with recently described global sequences. Discordance in
phylogenetic clustering among the S, N, and 1a genes of the HCoV-NL63 Malaysian sequences
was observed (Supplemental Figure 1). On the basis of the S (S1 domain) gene analysis, 26
Malaysian strains (26/42; 61.9%) belong to genotype B while another 16 Malaysian strains
(16/42; 38.1%) were classified within genotype C. In contrast, sequences of the three HCoV-NL63
genotypes (A, B, and C) appear to be intermingled in the N and 1a phylogenetic trees.
Such discordance was similarly reported in earlier studies, which attributed this
phylogenetic pattern to multiple recombination events along the HCoV-NL63
genome; in addition, the S1 region sequenced in this study is considered the most
variable along the genome, whereas the N and 1a (nsp3) genes are highly conserved. 5 To estimate the
genetic diversity between HCoV-NL63 genotypes A, B, and C, inter-genotype pairwise genetic
distance was assessed for the S gene (Table 3). Genetic distances between genotypes A versus B
and B versus C were high (more than 5.0%), compared with that between genotypes A versus C,
which was 2.1%. This is consistent with the phylogenetic tree topology in which genotypes A
and C were more closely related and probably shared a common ancestor.

At least one gene (S, N, and/or 1a) was successfully sequenced from the 23 HCoV-229E-positive
specimens (16, 18, and 22 sequences for the HCoV-229E S, N, and 1a genes, respectively).
Phylogenetic analysis revealed that all of the HCoV-229E sequences obtained in this study were
classified within group 4, which includes isolates that have been globally circulating since 2001
(Figure 3 and Supplemental Figure 2). 6,30,31 The group was supported by a posterior probability
of 1.0 and bootstrap value of 100% at the internal nodes of the MCC and ML trees of the S gene,
respectively, with an intra-group pairwise genetic distance of 0.3% ± 0.1%. Such phylogenetic
data were comparable to those obtained from the N tree, a result of the substitution hot spots
in the S1 and N regions of the HCoV-229E genome. 30 The four HCoV-229E groups could not be
clearly defined within the 1a gene tree because of the limited number of reference sequences
available in the public database (Supplemental Figure 2). Inter-genotype pairwise genetic
distance was generally low (below 5.0%) in the S gene among groups 1-4 (Table 3).

Estimation of divergence times. 

The molecular clock analysis of HCoV-NL63 and HCoV-229E was performed using the
coalescent-based Bayesian relaxed molecular clock under the constant and exponential tree
models (Figures 2 and 3). The mean evolutionary rates for the S gene of HCoV-NL63 and
HCoV-229E were newly estimated based on the constant tree model at 4.3 × 10⁻⁴ (2.3-6.7 ×
10⁻⁴) and 3.9 × 10⁻⁴ (1.3-6.4 × 10⁻⁴) substitutions/site/year, respectively. These results were
similar to the previously reported substitution rate of the alphacoronavirus S gene (3.3 × 10⁻⁴
substitutions/site/year). 5 The evolutionary analysis indicated that the time of the most recent
common ancestor (tMRCA) of HCoV-NL63 dated back to the 1920s, while the estimated
divergence time of genotype A was dated to 1975, followed by genotype B around 1996 and
genotype C in 2003 (Figure 2). Furthermore, the root of the HCoV-229E tree (Figure 3) was
dated to around 1955, while the tMRCA of group 1 was dated to 1976, followed by that of group
2 in 1981, group 3 in 1989, and group 4 in 1996. The appearance of groups 1-4 in
chronological order supports the earlier reported hypothesis that positive selection and
genetic drift play a major role in the evolution of HCoV-229E. 6,30 To the best of our knowledge,
this is the first study to report the divergence times of human alphacoronavirus genotypes. In
addition, the most recently reported HCoV-229E strains (between 2001 and 2013) from major
parts of the world belong to group 4. In accordance with earlier studies, genotype replacement is
evident within HCoV-229E, although sampling bias may also influence the results. 6,30 Bayes
factor analysis showed no significant difference (Bayes factor less than 3.0) between the constant
and exponential coalescent models of demographic analysis, and the divergence times
estimated using the constant coalescent tree model were similar to those calculated using the
exponential model (Supplemental Table 1).
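A back-of-the-envelope calculation shows how a substitution rate translates a genetic distance into a time scale. The sketch below is not the Bayesian relaxed-clock analysis used in the study; it assumes a strict clock, contemporaneously sampled sequences, and an uncorrected distance, under which two lineages separated for t years accumulate a distance of about 2 × rate × t. The function name is hypothetical; the inputs are the genotype A versus B distance from Table 3 (7.6%) and the HCoV-NL63 S gene rate reported above (4.3 × 10⁻⁴ substitutions/site/year).

```python
def strict_clock_divergence_years(pairwise_distance, rate_per_site_per_year):
    """Years since two lineages split, assuming a strict molecular clock.

    Both lineages accumulate substitutions, hence the factor of two.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return pairwise_distance / (2.0 * rate_per_site_per_year)


distance_a_vs_b = 0.076   # Table 3, HCoV-NL63 genotype A versus B
rate_nl63_s_gene = 4.3e-4  # substitutions/site/year, constant tree model

years = strict_clock_divergence_years(distance_a_vs_b, rate_nl63_s_gene)
print(f"approximate time since divergence: {years:.0f} years")
```

This crude estimate comes to roughly 88 years, which is of the same order as the root age inferred for HCoV-NL63; the coalescent analysis additionally accounts for sampling dates, rate variation among branches, and uncertainty.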



Clinical symptoms assessment. 

Clinical findings of the URTI symptoms (sneezing, nasal discharge, nasal congestion,
headache, sore throat, hoarseness of voice, muscle ache, and cough) and their severity levels
(none, moderate, and severe) were analyzed using the two-tailed Fisher’s exact test. The
association between symptom severity and HCoV-NL63/HCoV-229E infection was not significant
(P values > 0.05) (Supplemental Table 2). In line with previous clinical studies, 10,44,45 the
majority of patients infected with HCoV-NL63 and HCoV-229E presented with at least one
respiratory symptom that was moderately severe.

In summary, this study provides insight into the phylogeny and evolution of the HCoV-NL63
and HCoV-229E genotypes. Genetic characterization of human alphacoronavirus isolates
currently circulating in Malaysia indicates the circulation of globally prevalent genotypes in the
tropical region of southeast Asia. This study has detailed the genetic history of HCoV-NL63 and
HCoV-229E genotypes. Since alphacoronaviruses evolve through recombination, positive
selection, and genetic drift events, continuous molecular surveillance of human alphacoronavirus
is warranted to keep track of the evolution of the virus in southeast Asia.

FIGURE 1. Annual distribution of HCoV-NL63 and HCoV-229E among adults with acute upper respiratory tract
infections in Kuala Lumpur, Malaysia. The total number of nasopharyngeal swabs screened and the monthly
distribution of HCoV-NL63 and HCoV-229E between March 2012 and February 2013 are presented.


FIGURE 2. Maximum clade credibility tree of HCoV-NL63. Spike gene (S1 domain) sequences (1,383 nt) were
analyzed under the relaxed molecular clock with a GTR+I substitution model and a constant size coalescent model
implemented in BEAST. Posterior probability values and the estimated times of the most recent common
ancestors with 95% highest posterior density are indicated on major nodes. The HCoV-NL63 sequences obtained
in this study are color coded and HCoV-NL63 genotypes A-C are indicated; green = genotype A, blue =
genotype B, and red = genotype C. The recombinant genotype is indicated in purple. The sampling site for
each sequence is indicated by a country code. Country codes are as follows: MY =
Malaysia; US = United States; JP = Japan; NL = Netherlands; CN = China. This figure appears in color at
www.ajtmh.org.


FIGURE 3. Maximum clade credibility tree of HCoV-229E. Spike gene (S1 domain) sequences (855 nt) were
analyzed under the relaxed molecular clock with a GTR+I substitution model and a constant size coalescent model
implemented in BEAST. Posterior probability values and the estimated times of the most recent common
ancestors with 95% highest posterior density are indicated on the major nodes. The HCoV-229E sequences
obtained in this study are color coded and the HCoV-229E groups 1-4 are indicated; green = genotype 1, red =
genotype 2, blue = genotype 3, and purple = genotype 4. The sampling site for each sequence is indicated by
a country code. Country codes are as follows: MY = Malaysia; US = United States; JP =
Japan; NL = Netherlands; CN = China; AU = Australia; IT = Italy. This figure appears in color at www.ajtmh.org.



Table 1

Polymerase chain reaction primers for HCoV-NL63 and HCoV-229E

| Target gene | HCoV | Primer | Location* | Sequence (5'-3') | Reference |
|---|---|---|---|---|---|
| Spike (S) | NL63 | SP1F | 20,390-20,412 | Forward: TGAGTTTGATTAAGAGTGGTAGG | 25 |
| | | SP2F | 20,397-20,418 | Forward (nested): GATTAAGAGTGGTAGGTTGTTG | 25 |
| | | SP1R | 21,809-21,828 | Reverse: CAAACTGCAAGTGCTCACAC | 25 |
| | | SP2R | 21,797-21,816 | Reverse (nested): GCTCACACTGCAACTTTTCA | 25 |
| | 229E | LPS1 | 20,732-20,751 | Forward: AATAATTGGTTCCTTCTAAC | 26 |
| | | JH1 | 20,797-20,818 | Forward (nested): TTTGTTGCTTAATTGCTTATGG | 26 |
| | | LPR | 21,710-21,728 | Reverse: AACATACACTGCCAAATTT | This study |
| | | JH2 | 21,675-21,694 | Reverse (nested): TTTGCCAAAAGAAAAAGGGC | 26 |
| Nucleocapsid (N) | NL63 and 229E | aN-F | 26,102-26,127 | Forward: ARRTTGCTTCATTTWWTCTAA | This study |
| | | | 25,652-25,672 | | |
| | | aN-Fn | 26,112-26,132 | Forward (nested): ATTTWWTCTAAACTAAACRAA | This study |
| | NL63 | NL-NR | 27,278-27,299 | Reverse: ATAATAAACAKTCAACTGGAAT | This study |
| | | NL-NRn | 27,267-27,287 | Reverse (nested): CAACTGGAATTACAAAACAAT | This study |
| | 229E | E-NR | 27,046-27,063 | Reverse: GATCCTTGTCAAGCCAAA | This study |
| | | E-NRn | 26,882-26,900 | Reverse (nested): AAAATTCCAACTAAAGCCT | This study |
| 1a | NL63 | SS5852-5Pf | 5,778-5,798 | Forward: CTTTTGATAACGGTCACTATG | 27 |
| | | P3E2-5Pf | 5,789-5,810 | Forward (semi-nested): GGTCACTATGTAGTTTATGATG | 27 |
| | | NL-1aR | 6,593-6,616 | Reverse: CTCATTACATAAAACATCRAACGG | This study |
| | 229E | E-1aF | 5,865-5,885 | Forward: CTGTTGAYAAAGGTCATTATA | This study |
| | | E-1aFn | 5,876-5,897 | Forward (semi-nested): GGTCATTATACTGTTTATGAYA | This study |
| | | E-1aR | 6,665-6,688 | Reverse: TTCATCACAAATAACATCAAATGG | This study |

* Nucleotide location was determined based on the HCoV-NL63 (NC_005831) and HCoV-229E (NC_002645)
reference sequences.
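Several primers in Table 1 contain degenerate positions written in standard IUPAC nucleotide codes (for example R = A or G, Y = C or T, W = A or T, K = G or T), so each such primer stands for a small pool of concrete sequences. The sketch below, with a hypothetical function name, enumerates that pool for the aN-F forward primer; it illustrates the notation only and is not part of the published protocol.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """List every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


variants = expand_degenerate("ARRTTGCTTCATTTWWTCTAA")  # aN-F from Table 1
print(len(variants), "concrete sequences, e.g.", variants[0])
```

With two R and two W positions, the aN-F primer expands to 2 × 2 × 2 × 2 = 16 sequences.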




Table 2

Demographic data of 68 adult outpatients infected with human alphacoronavirus in Kuala Lumpur, Malaysia, 2012-2013

| Factor | HCoV-NL63 (N = 45) | HCoV-229E (N = 23) | P value |
|---|---|---|---|
| Gender | | | 0.80 |
| Male | 25 (55.6%) | 12 (52.2%) | |
| Female | 20 (44.4%) | 11 (47.8%) | |
| Age | | | 0.45 |
| <40 | 13 (28.9%) | 7 (30.4%) | |
| 40-60 | 10 (22.2%) | 8 (34.8%) | |
| >60 | 22 (48.9%) | 8 (34.8%) | |
| Symptoms | | | 0.99 |
| Sneezing | 42 (93.3%) | 20 (87.0%) | |
| Nasal discharge | 38 (84.4%) | 19 (82.6%) | |
| Nasal congestion | 29 (64.4%) | 15 (65.2%) | |
| Headache | 23 (51.1%) | 13 (56.5%) | |
| Sore throat | 32 (68.9%) | 14 (60.9%) | |
| Hoarseness of voice | 35 (77.8%) | 15 (65.2%) | |
| Muscle ache | 27 (60.0%) | 16 (69.6%) | |
| Cough | 43 (95.6%) | 20 (87.0%) | |
| Ethnicity | | | 0.08 |
| Malay | 11 (24.5%) | 5 (21.8%) | |
| Chinese | 24 (53.3%) | 7 (30.4%) | |
| Indian | 10 (22.2%) | 11 (47.8%) | |
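The three-category comparisons in Table 2 (age and ethnicity) can be checked with a Pearson χ² test. The sketch below assumes an uncorrected Pearson χ² on a 3 × 2 table, which has two degrees of freedom; for that special case the P value has the closed form exp(-χ²/2), so no statistics library is needed. Whether SPSS was run with exactly these settings is an assumption, and the function names are hypothetical.

```python
from math import exp


def pearson_chi_square(table):
    """Pearson chi-square statistic for a contingency table given as rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def p_value_two_degrees_of_freedom(statistic):
    """Upper-tail chi-square probability; this closed form is valid only for df = 2."""
    return exp(-statistic / 2.0)


# Rows are categories; columns are (HCoV-NL63, HCoV-229E) counts from Table 2.
age = [[13, 7], [10, 8], [22, 8]]
ethnicity = [[11, 5], [24, 7], [10, 11]]

p_age = p_value_two_degrees_of_freedom(pearson_chi_square(age))
p_ethnicity = p_value_two_degrees_of_freedom(pearson_chi_square(ethnicity))
print(f"age: P = {p_age:.2f}; ethnicity: P = {p_ethnicity:.2f}")
```

Under these assumptions the values come out close to the 0.45 and 0.08 listed in the table.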



Table 3

The genetic diversity among alphacoronavirus genotypes in the spike gene

| HCoV-NL63 | A | B | C |
|---|---|---|---|
| A | - | 0.8 | 0.5 |
| B | 7.6 | - | 0.6 |
| C | 2.1 | 6.7 | - |

| HCoV-229E | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1 | - | 0.4 | 0.6 | 0.7 |
| 2 | 1.5 | - | 0.3 | 0.4 |
| 3 | 2.5 | 1.2 | - | 0.3 |
| 4 | 3.5 | 2.6 | 1.5 | - |

* Pairwise genetic distances are expressed in percentage (%) of nucleotide difference.

† Standard error estimates of the mean genetic distances are shown in the upper diagonal.






Supplemental Figure 1. Phylogenetic analysis of the HCoV-NL63 spike, nucleocapsid, and 1a genes. The partial
spike (S1) (1,383 nt), complete nucleocapsid (1,133 nt), and partial 1a (nsp3) (781 nt) maximum likelihood trees
were constructed using the Hasegawa-Kishino-Yano nucleotide substitution model plus
discrete gamma categories in Phylogenetic Analysis Using Parsimony (PAUP). The HCoV-NL63 strains obtained from this
study are color coded and the HCoV-NL63 genotypes A-C are indicated; green = genotype A, blue = genotype
B, and red = genotype C. The recombinant genotype is indicated in purple. Scale bars indicating genetic
distance (in nucleotide substitutions per site) are shown. Each HCoV-NL63 sequence was assigned to its genotype
based on the S1 phylogenetic analysis. Country codes are as follows: MY = Malaysia; US = United States; JP =
Japan; NL = Netherlands; CN = China.


Supplemental Figure 2. Phylogenetic analysis of the HCoV-229E spike, nucleocapsid, and 1a genes. The partial
spike (S1) (855 nt), complete nucleocapsid (1,330 nt), and partial 1a (nsp3) (766 nt) maximum likelihood trees were
constructed using the Hasegawa-Kishino-Yano nucleotide substitution model plus discrete
gamma categories in Phylogenetic Analysis Using Parsimony (PAUP). The HCoV-229E strains obtained from this study are
color coded and the HCoV-229E groups 1-4 are indicated; green = genotype 1, red = genotype 2, blue = genotype
3, and purple = genotype 4. Scale bars indicating genetic distance (in nucleotide substitutions per site) are shown.
Each HCoV-229E sequence was assigned to its genotype based on the S1 phylogenetic analysis. Country codes are
as follows: MY = Malaysia; US = United States; JP = Japan; NL = Netherlands; CN = China; AU = Australia; IT =
Italy.


Supplemental Table 1

Evolutionary characteristics of HCoV-NL63 and HCoV-229E genotypes

| Subtype-gene | Evolutionary rate* | Genotype | tMRCA† |
|---|---|---|---|
| NL63-Spike | 4.3 × 10⁻⁴ (2.1-6.6 × 10⁻⁴) | All genotypes | 1902.2 (1805.4-1974.4) |
| | | Genotype A | 1973.9 (1961.2-1983.8) |
| | | Genotype B | 1995.6 (1989.7-2000.2) |
| | | Genotype C | 2003.0 (1998.6-2006.5) |
| 229E-Spike | 3.9 × 10⁻⁴ (1.3-6.5 × 10⁻⁴) | All groups | 1956.8 (1948.4-1962.0) |
| | | Group 1 | 1976.6 (1973.7-1978.9) |
| | | Group 2 | 1981.1 (1979.6-1982.0) |
| | | Group 3 | 1989.0 (1987.4-1990.0) |
| | | Group 4 | 1996.3 (1993.0-1999.0) |

* Estimated mean rates of evolution expressed as 10⁻⁴ nucleotide substitutions/site/year under a relaxed molecular
clock with GTR+I substitution model and an exponential tree model. The 95% highest posterior density (HPD)
confidence intervals are included in parentheses.

† Mean time of the most recent common ancestor (tMRCA, in calendar year). The 95% highest posterior density
confidence intervals are indicated.






Supplemental Table 2

Comparison of upper respiratory tract infection symptom severities between patients infected with HCoV-NL63
and HCoV-229E

| Symptom | Severity level | HCoV-NL63 | HCoV-229E | P value* |
|---|---|---|---|---|
| Sneezing | None | 3 | 3 | 0.472 |
| | Moderate | 35 | 15 | |
| | Severe | 7 | 5 | |
| Nasal discharge | None | 7 | 4 | 0.051 |
| | Moderate | 33 | 11 | |
| | Severe | 5 | 8 | |
| Nasal congestion | None | 16 | 8 | 0.727 |
| | Moderate | 24 | 11 | |
| | Severe | 5 | 4 | |
| Headache | None | 22 | 10 | 0.696 |
| | Moderate | 17 | 8 | |
| | Severe | 6 | 5 | |
| Sore throat | None | 12 | 9 | 0.269 |
| | Moderate | 26 | 9 | |
| | Severe | 6 | 5 | |
| Hoarseness of voice | None | 10 | 8 | 0.172 |
| | Moderate | 34 | 13 | |
| | Severe | 1 | 2 | |
| Muscle ache | None | 17 | 7 | 0.252 |
| | Moderate | 20 | 15 | |
| | Severe | 7 | 1 | |
| Cough | None | 2 | 3 | 0.477 |
| | Moderate | 32 | 15 | |
| | Severe | 11 | 5 | |

* P values < 0.05 represent significant results.